

*D  
A  
6  
C  
5*

**Application Serial No.:** 09/523,877  
**Filed:** March 13, 2000

shifting all of said bits of said at least one short immediate value using said op-code and only said portion of bits to produce a shifted immediate value; and  
storing said shifted immediate value in a register.

32. The digital processor of Claim 31, wherein said at least one instruction word having a plurality of fields and said at least one instruction word having a short immediate value comprise the same instruction word(s).

10

REMARKS

Claims 1-24 were pending in the application. By this paper, Applicant has added new Claims 25-32. Accordingly, Claims 1-32 are presented herein for examination.

New Claims 25-32 relate generally to subject matter incorporated by reference into the present application from Applicant's co-pending U.S. Patent Application Serial No. 09/524,178 filed March 13, 2000 and entitled "Method and Apparatus for Loose Register Encoding Within a Pipelined Processor".

Applicant has also amended the specification to (i) include appropriate copyright notice; (ii) correct the priority claim of the application, and (iii) include information regarding related applications. Applicant submits that these amendments introduce no new matter.

20 Replacement ("clean") pages for the foregoing amendments and additions are provided herewith.

Applicant hereby specifically reserves the right to prosecute claims of different or broader scope, including those cancelled without prejudice herein, in a continuation or divisional application.

25 Applicant notes that any amendments, cancellations, or additions made herein are made solely for the purposes of more clearly and particularly describing and claiming the invention, and not for purposes of overcoming art or for reasons relating to patentability. The Examiner should infer no (i) adoption of a position with respect to patentability, (ii) change in the Applicant's position with respect to any claim or subject matter of the invention, or (iii)

**Application Serial No.: 09/523,877**  
**Filed: March 13, 2000**

acquiescence in any way to any position taken by the Examiner, based on such amendments, cancellations, or additions.

If the Examiner has any questions or comments which may be resolved over the  
5 telephone, he is requested to call the undersigned at (858) 505-1166.

Respectfully submitted,

GAZDZINSKI & ASSOCIATES

10

Dated: August 23, 2001

By: 

Robert F. Gazdzinski  
Registration No. 39,990  
3914 Murphy Canyon Road, Suite A232  
San Diego, CA 92123  
Telephone No.: (858) 505-1166  
Facsimile No.: (858) 505-1168

15

ARC.005A



PATENT

## METHOD AND APPARATUS FOR JUMP DELAY SLOT CONTROL IN A PIPELINED PROCESSOR

5

This application claims priority to U.S. Provisional Patent Application Serial No. 60/134,253 filed May 13, 1999, entitled "Method And Apparatus For Synthesizing And Implementing Integrated Circuit Designs".

10

### Copyright

A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.

20

### Related Applications

This application is related to co-pending U.S. Patent Application Serial No. 09/523,871 filed March 13, 2000 and entitled "Method and Apparatus for Jump Control in a Pipelined Processor", U.S. Patent Application Serial No. 09/524,179 filed March 13, 25 2000 and entitled "Method and Apparatus for Processor Pipeline Segmentation and Re-Assembly", U.S. Patent Application Serial No. 09/524,178 filed March 13, 2000 and entitled "Method and Apparatus for Loose Register Encoding Within a Pipelined Processor", and U.S. Patent Application No. 09/418,663 filed October 14, 1999, entitled "Method and Apparatus for Managing the Configuration and Functionality of a 30 Semiconductor Design".

Background of the Invention

**1. Field of the Invention**

5       The present invention relates to the field of integrated circuit design, specifically to the use of a hardware description language (HDL) for implementing instructions in a pipelined central processing unit (CPU) or user-customizable microprocessor.

**2. Description of Related Technology**

10       RISC (or reduced instruction set computer) processors are well known in the computing arts. RISC processors generally have the fundamental characteristic of utilizing a substantially reduced instruction set as compared to non-RISC (commonly known as "CISC") processors. Typically, RISC processor machine instructions are not all micro-coded, but rather may be executed immediately without decoding, thereby affording  
15 significant economies in terms of processing speed. This "streamlined" instruction handling capability furthermore allows greater simplicity in the design of the processor (as

## APPENDIX A - EXEMPLARY DELAY SLOT VHDL USED FOR SYNTHESIS

© 1999 ARC International plc. All rights reserved

```
5   library ieee.arc;
    use ieee.std_logic_1164.all;
    use arc.arcutil.all;

10  entity delay_slot is:
    PORT(
        ip2iv      ;      in  std_ulogic;
        ip2condtrue ;      in  std_ulogic;
        ip2i       ;      in  std_ulogic_vector(4 downto 0);
        ip2dd      ;      in  std_ulogic_vector(1 downto 0);
15    p2killnext ;      out  std_ulogic);

    end delay_slot;

20  architecture synthesis of delay_slot is:
    signal    ip2bch      ;      std_ulogic;
    signal    ip2rjmp     ;      std_ulogic;
    signal    ip2lpcc     ;      std_ulogic;
    signal    ip2jumping  ;      std_ulogic;
25    signal    ip2nojump   ;      std_ulogic;
    signal    ip2killnext  ;      std_ulogic;

    begin
30    ip2bch <=  '1' WHEN ip2iv = '1'      AND (ip2i = obcc OR ip2i = oblcc
                                                OR ip2i = olpcc OR ip2i =
                                                ojcc ) ELSE
                                                '0';

35    ----- Delay slot canceling logic. -----
40    -- Due to the fact that the LPcc instruction uses its condition code in a
       different way to the other branch instruction, two signals are required in the illustrated
       embodiment to specify a jump instruction which is jumping, and one for specifying a
       jump instruction which is not.

45    -- regular jump instruction --
        ip2rjmp    <= '1' WHEN ip2iv = '1'      AND (ip2i = obcc
                                                OR ip2i = oblcc
                                                OR ip2i = ojcc ) ELSE
```

```

        '0';

-- loop setup instruction --

5      ip21pcc      <= '1' WHEN ip2iv = '1' AND (ip2i = o1pcc) ELSE
        '0';

-- jumping/not jumping signals --

10     ip2jumping   <= (ip2rjmp AND ip2condtrue) OR (ip21pcc AND NOT
        ip2condtrue);
        ip2nojump   <= (ip2rjmp AND NOT ip2condtrue) OR (ip21pcc AND
        ip2condtrue);

15     -- the kill signal itself --
--
-- ip2killnext include p2iv and a full decode for a jump, so can
-- be used in pipect1 without any further decode (apart from en2 of course).

20     ip2killnext<= '1' WHEN      ((ip2dd = dmk ) AND ip2bch = '1')
                           OR       ((ip2dd = dmnd) AND ip2jumping = '1')
                           OR       ((ip2dd = dmjd) AND ip2nojump = '1')
ELSE
25     '0';

        p2killnext <= ip2killnext;

end synthesis;
30

```

## APPENDIX B – EXEMPLARY SYNTHESIS SCRIPT USED FOR DELAY SLOT SYNTHESIS

© 1999 ARC International plc. All rights reserved

```
5  /* produce result without any optimization */  
/* assuming using the LSI Logic 10k library */  
  
analyze - format vhdl - lib ARC      {ARCHOME + "/arc/vhdl/arcutil.vhdl"}  
analyze - format vhdl - lib USER     {USERDIR  + "/vhdl/delay_slot.vhdl"}  
10 elaborate delay_slot -arch "synthesis" - lib USER  
  
compile  
  
15 write - format db - hierarchy -output db/delay_slot_noopt.db  
remove_design -all  
  
/* result with logic optimization */  
  
20 elaborate delay_slot -arch "synthesis" -lib USER  
  
/* timing for inputs direct from flipflops */  
t_in = 1.33  
  
25 /* create an imaginary 20MHz clock */  
  
create_clock -name ck -period 50  
  
30 set_input_delay    20    ip2condtrue   -clock ck  
set_input_delay    t_in   ip2dd        -clock ck  
set_input_delay    t_in   ip21        -clock ck  
set_input_delay    t_in   ip2iv        -clock ck  
  
35 set_output_delay  28    p2killnext   -clock ck  
  
compile  
write -format db -hierarchy -output db/delay_slot_opt.db
```

**Revised Claims as of August 23, 2001 (Pages 22 to 28b)**

1. A method of controlling the execution of instructions within a pipelined processor, comprising:

5 providing an instruction set comprising a plurality of instruction words, each of said instruction words comprising a plurality of data bits, at least one of said words comprising a jump instruction;

assigning one of a plurality of values to at least one of said data bits of said at least one jump instruction; and

10 controlling the execution of at least one subsequent instruction within said pipeline based on said one assigned value of said at least one data bit when said at least one jump instruction is decoded.

2. The method of Claim 1, wherein the act of assigning comprises:

15 identifying a plurality of data bits within said at least one jump instruction; and assigning one of two discrete values to each of said data bits, the combination of said two discrete values representing at least three jump delay slot modes within said processor.

20 3. The method of Claim 2, wherein the act of controlling the execution based on said discrete values comprises selecting at least one mode from the group comprising:

- (i) executing said at least one subsequent instruction under all circumstances;
- (ii) executing said at least one subsequent instruction only if a jump occurs; and
- (iii) stalling the pipeline or inserting a bubble into the pipeline if a jump occurs.

25 4. The method of Claim 3, wherein said at least one jump instruction comprises a conditional branch instruction.

5. The method of Claim 1, wherein the act of controlling the execution based on said one assigned value comprises:

- 30
- (i) executing said at least one subsequent instruction under all circumstances;
  - (ii) executing said at least one subsequent instruction only if a jump occurs; and

(iii) stalling the pipeline or inserting a bubble into the pipeline if a jump occurs.

6. A processor design synthesized by the method comprising:  
inputting information to a first file to include an instruction set having at least one  
jump instruction, said at least one jump instruction comprising at least one of a plurality  
5 of jump delay modes;  
defining the location of at least one library file;  
generating a script using said first file, said library file, and user input  
information;  
running said script to create a customized description language model; and  
10 synthesizing said design based on said description language model.

7. The method of Claim 6, wherein the act of synthesizing comprises running  
synthesis scripts based on said description language model.

15 8. The method of Claim 7, further comprising the act of generating a third file  
for use with a simulation, and simulating said design using said third file.

9. The method of Claim 8, further comprising the act of evaluating the  
acceptability of the design based on said simulation.

20 10. The method of Claim 9, further comprising the acts of revising the design to  
produce a revised design, and re-synthesizing said revised design.

11. The method of Claim 1, wherein the act of inputting comprises providing a  
25 plurality of input parameters associated with said design, said parameters comprising:  
(i) a cache configuration; and  
(ii) a memory interface configuration.

12. A machine readable data storage device comprising:  
30 a data storage medium adapted to store a plurality of data bits; and

a computer program rendered as a plurality of data bits and stored on said data storage medium, said program being adapted to run on the processor of a computer system and synthesize integrated circuit logic for use in a processor having a pipeline and incorporating an instruction set having at least one branching instruction and a plurality 5 of jump modes associated therewith, said plurality of jump modes comprising at least the following:

- (i) executing a subsequent instruction within said pipeline under all circumstances;
- (ii) executing a subsequent instruction within said pipeline only if jumping occurs; and
- 10 (iii) stalling said pipeline if jumping occurs.

13. The data storage device of Claim 12, wherein said data storage medium is a compact disk-read only memory (CD-ROM), and said plurality of data bits comprises 15 an object representation of said program.

14. A digital processor comprising:  
a processor core having a multistage instruction pipeline, said core being adapted to decode and execute an instruction set comprising a plurality of instruction words;  
20 a data interface between said processor core and an information storage device; and  
an instruction set comprising a plurality of instruction words, at least one of said instruction words being a jump instruction containing data defining a plurality of jump delay slot modes, said plurality of modes controlling the execution of instructions within said instruction pipeline of said processor core in response to said at least one jump  
25 instruction word within said instruction set.

15. The processor of Claim 14, wherein said plurality of jump delay slot modes comprises at least the following modes:

- (i) executing a subsequent instruction within said pipeline under all circumstances;
- (ii) executing a subsequent instruction within said pipeline only if jumping occurs; and
- 5 (iv) stalling the pipeline if jumping occurs.

16. The processor of Claim 14, wherein said at least one jump instruction comprises a conditional branch instruction having an associated logical condition, the execution of a jump to the address within said information storage device specified by  
10 said at least one conditional branch instruction being determined by said logical condition.

17. A digital processor having at least one pipeline and an associated data storage device, wherein the execution of instructions within said at least one pipeline is  
15 controlled by the method comprising:

storing an instruction set within said data storage device, said instruction set comprising a plurality of instruction words, each of said instruction words comprising a plurality of data bits, at least one of said instruction words comprising a branch instruction directing branching to a first address within said data storage device;  
20 assigning one of a plurality of values to at least one of said data bits of said at least one branch instruction;  
decoding said at least one branch instruction including said one value;  
determining whether to execute an instruction within said pipeline in a stage preceding that of said at least one branch instruction based on said one value; and  
25 branching to said first address based on said at least one branching instruction.

18. The processor of Claim 17, wherein said data bits comprise binary (base 2) data.

19. The method of Claim 17, wherein said at least one pipeline comprises at least a three stage instruction pipeline comprising instruction fetch, decode, and execute stages.

5 20. A method of controlling the branching within the program of a multi-stage pipelined digital processor, comprising:

storing an instruction set within said data storage device, said instruction set comprising a plurality of instruction words, each of said instruction words comprising a plurality of data bits, at least one of said instruction words comprising a branch

10 instruction directing branching to a first address within said data storage device based on a first parameter;

defining a plurality of jump modes;

assigning at least one of said plurality of jump modes to at least one of said data bits of said at least one branch instruction;

15 decoding said at least one branch instruction including said at least one data bit; and

determining whether to branch to said first address based on said at least one data bit and said first parameter.

20 21. The method of Claim 20, wherein the act of defining a plurality of jump modes comprises defining the following modes:

- (i) executing a subsequent instruction under all circumstances;
- (ii) executing a subsequent instruction only if jumping occurs; and
- (iii) stalling the pipeline or inserting a bubble into the pipeline if jumping occurs.

25 22. An apparatus for synthesizing the design of logic used within a digital processor, comprising:

a central processing unit;

a data storage device operatively coupled to said central processing unit, said data

30 storage device being adapted to store and retrieve a computer program;

**Application Serial No.:**

**Filed:**

**09/523,877**

**March 13, 2000**

an input device adapted to generate signals in response to inputs from a user of said system;

a computer program, stored on said data storage device, said program being adapted to receive said signals and permit said user to:

5       input information to a first file to include an instruction set having at least one jump instruction, said at least one jump instruction comprising at least one of a plurality of jump delay modes;

      define the location of at least one library file;

      generate a script using said first file, said library file, and user input information;

10      run said script to create a customized description language model; and

      synthesize said design based on said description language model.

23.     A digital processor comprising:

      processing means having a multistage data pipeline, said processing means being adapted to decode and execute an instruction set comprising a plurality of instruction words;

      means for storing data;

      data interface means for transferring data between said processing means and said means for storing data; and

20      an instruction set comprising a plurality of instruction words, at least one of said instruction words being a jump instruction containing data defining a plurality of jump control means, said plurality of jump control means controlling the execution of instructions within said data pipeline of said processing means in response to said at least one jump instruction word within said instruction set.

25     24.     An apparatus for synthesizing the design of logic used within a digital processor, comprising:

      means for processing data;

      means for storing data, said means for storing being operatively coupled to said means for processing, and adapted to store and retrieve a computer program;

**Application Serial No.:** 09/523,877  
**Filed:** March 13, 2000

means for inputting information operatively coupled to said means for processing, said means for inputting being adapted to generate signals in response to inputs from a user of said system;

5        a computer program, stored on said data storage device, said program being adapted to receive said signals and comprising:

means for inputting information to a first file to include an instruction set having at least one jump instruction, said at least one jump instruction comprising at least one of a plurality of jump delay modes;

means for defining the location of at least one library file;

10      means for generating a script using said first file, said library file, and user input information;

means for running said script to create a customized description language model; and

means for synthesizing said design based on said description language model.

15

25.     The method of Claim 1, wherein at least one of said plurality of instruction words comprises an op-code and a plurality of fields, each of said fields comprising a plurality of bits, said at least one instruction word being encoded according to the method comprising:

20      associating a first of said fields with a first data source;

associating a second of said fields with a second data source; and

performing a logical operation using said first and second data sources as operands, said logical operation being specified by said op-code.

25      26.     The method of Claim 1, further comprising generating a long immediate constant using a single word instruction according to the method comprising:

providing an instruction word having an op-code and at least one short immediate value associated therewith, said at least one short immediate value comprising a plurality of bits;

selecting a portion of said plurality of bits of said at least one short immediate value;

shifting all of said bits of said at least one short immediate value using said op-code and only said portion of bits to produce a shifted immediate value; and

5       storing said shifted immediate value in a register.

27.     The method of Claim 25, wherein said plurality of instruction words further comprises at least one instruction word having an op-code and at least one short immediate value associated therewith, said at least one short immediate value comprising  
10       a plurality of bits, said at least one instruction word with short immediate value being used to generate a long immediate constant according to the method comprising:

selecting a portion of said plurality of bits of said at least one short immediate value;

15       shifting all of said bits of said at least one short immediate value using said op-code and only said portion of bits to produce a shifted immediate value; and

storing said shifted immediate value in a register.

28.     The method of Claim 27, wherein said at least one instruction word having a plurality of fields and said at least one instruction word having a short immediate value  
20       comprise the same instruction word(s).

29.     The digital processor of Claim 14, wherein said wherein at least one of said plurality of instruction words comprises an op-code and a plurality of fields, each of said fields comprising a plurality of bits, said at least one instruction word being encoded  
25       by:

associating a first of said fields with a first data source;

associating a second of said fields with a second data source; and

performing a logical operation using said first and second data sources as  
operands, said logical operation being specified by said op-code.

30. The digital processor of Claim 14, wherein said processor is further adapted to generate a long immediate constant using a single word instruction by:

providing an instruction word having an op-code and at least one short immediate value associated therewith, said at least one short immediate value comprising a plurality of bits;

5 selecting a portion of said plurality of bits of said at least one short immediate value;

shifting all of said bits of said at least one short immediate value using said op-code and only said portion of bits to produce a shifted immediate value; and

10 storing said shifted immediate value in a register.

31. The digital processor of Claim 29, wherein said plurality of instruction words further comprises at least one instruction word having an op-code and at least one short immediate value associated therewith, said at least one short immediate value comprising a plurality of bits, said at least one instruction word with short immediate value being used to generate a long immediate constant according to the method comprising:

selecting a portion of said plurality of bits of said at least one short immediate value;

20 shifting all of said bits of said at least one short immediate value using said op-code and only said portion of bits to produce a shifted immediate value; and

storing said shifted immediate value in a register.

32. The digital processor of Claim 31, wherein said at least one instruction word having a plurality of fields and said at least one instruction word having a short immediate value comprise the same instruction word(s).